operational data
EcBot: Data-Driven Energy Consumption Open-Source MATLAB Library for Manipulators
Heredia, Juan, Schlette, Christian, Kjærgaard, Mikkel Baun
Existing literature proposes models for estimating the electrical power of manipulators, yet two primary limitations prevail. First, most models are predominantly tested using traditional industrial robots. Second, these models often lack accuracy. To address these issues, we introduce an open source Matlab-based library designed to automatically generate \ac{ec} models for manipulators. The necessary inputs for the library are Denavit-Hartenberg parameters, link masses, and centers of mass. Additionally, our model is data-driven and requires real operational data, including joint positions, velocities, accelerations, electrical power, and corresponding timestamps. We validated our methodology by testing on four lightweight robots sourced from three distinct manufacturers: Universal Robots, Franka Emika, and Kinova. The model underwent testing, and the results demonstrated an RMSE ranging from 1.42 W to 2.80 W for the training dataset and from 1.45 W to 5.25 W for the testing dataset.
- North America > Costa Rica > Heredia Province > Heredia (0.41)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Europe > Denmark > Southern Denmark (0.04)
Toward Physics-Informed Machine Learning for Data Center Operations: A Tropical Case Study
Wang, Ruihang, Cao, Zhiwei, Zhang, Qingang, Tan, Rui, Wen, Yonggang, Leung, Tommy, Kennedy, Stuart, Teoh, Justin
--Data centers are the backbone of computing capacity. Operating data centers in the tropical regions faces unique challenges due to consistently high ambient temperature and elevated relative humidity throughout the year . These conditions result in increased cooling costs to maintain the reliability of the computing systems. While existing machine learning-based approaches have demonstrated potential to elevate operations to a more proactive and intelligent level, their deployment remains dubious due to concerns about model extrapolation capabilities and associated system safety issues. T o address these concerns, this article proposes incorporating the physical characteristics of data centers into traditional data-driven machine learning solutions. We begin by introducing the data center system, including the relevant multiphysics processes and the data-physics availability. Next, we outline the associated modeling and optimization problems and propose an integrated, physics-informed machine learning system to address them. Using the proposed system, we present relevant applications across varying levels of operational intelligence. A case study on an industry-grade tropical data center is provided to demonstrate the effectiveness of our approach. Finally, we discuss key challenges and highlight potential future directions. He data center (DC) industry is experiencing rapid growth driven by the increasing demand for cloud computing, data storage, and artificial intelligence (AI) services. This expansion is also occurring in tropical regions, where digital infrastructure is being scaled up to meet regional computing needs [1]. As DCs grow in size and complexity, their power consumption increases accordingly, particularly in tropical climates where high ambient temperature and humidity place additional strain on the cooling systems. According to the International Energy Agency (IEA), global DC energy consumption could rise to 1050 TWh by 2026, up from 460 TWh in 2022 [2].
- Information Technology > Services (1.00)
- Energy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Risk Analysis of Flowlines in the Oil and Gas Sector: A GIS and Machine Learning Approach
Chittumuri, I., Alshehab, N., Voss, R. J., Douglass, L. L., Kamrava, S., Fan, Y., Miskimins, J., Fleckenstein, W., Bandyopadhyay, S.
This paper presents a risk analysis of flowlines in the oil and gas sector using Geographic Information Systems (GIS) and machine learning (ML). Flowlines, vital conduits transporting oil, gas, and water from wellheads to surface facilities, often face under-assessment compared to transmission pipelines. This study addresses this gap using advanced tools to predict and mitigate failures, improving environmental safety and reducing human exposure. Extensive datasets from the Colorado Energy and Carbon Management Commission (ECMC) were processed through spatial matching, feature engineering, and geometric extraction to build robust predictive models. Various ML algorithms, including logistic regression, support vector machines, gradient boosting decision trees, and K-Means clustering, were used to assess and classify risks, with ensemble classifiers showing superior accuracy, especially when paired with Principal Component Analysis (PCA) for dimensionality reduction. Finally, a thorough data analysis highlighted spatial and operational factors influencing risks, identifying high-risk zones for focused monitoring. Overall, the study demonstrates the transformative potential of integrating GIS and ML in flowline risk management, proposing a data-driven approach that emphasizes the need for accurate data and refined models to improve safety in petroleum extraction.
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.50)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
ACCEPT: Diagnostic Forecasting of Battery Degradation Through Contrastive Learning
Sadler, James, Mohammed, Rizwaan, Castle, Michael, Uddin, Kotub
Modeling lithium-ion battery (LIB) degradation offers significant cost savings and enhances the safety and reliability of electric vehicles (EVs) and battery energy storage systems (BESS). Whilst data-driven methods have received great attention for forecasting degradation, they often demonstrate limited generalization ability and tend to underperform particularly in critical scenarios involving accelerated degradation, which are crucial to predict accurately. These methods also fail to elucidate the underlying causes of degradation. Alternatively, physical models provide a deeper understanding, but their complex parameters and inherent uncertainties limit their applicability in real-world settings. To this end, we propose a new model - ACCEPT. Our novel framework uses contrastive learning to map the relationship between the underlying physical degradation parameters and observable operational quantities, combining the benefits of both approaches. Furthermore, due to the similarity of degradation paths between LIBs with the same chemistry, this model transfers non-trivially to most downstream tasks, allowing for zero-shot inference. Additionally, since categorical features can be included in the model, it can generalize to other LIB chemistries. This work establishes a foundational battery degradation model, providing reliable forecasts across a range of battery types and operating conditions.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- Energy > Energy Storage (1.00)
- Electrical Industrial Apparatus (1.00)
Usage-Specific Survival Modeling Based on Operational Data and Neural Networks
Holmer, Olov, Krysander, Mattias, Frisk, Erik
Accurate predictions of when a component will fail are crucial when planning maintenance, and by modeling the distribution of these failure times, survival models have shown to be particularly useful in this context. The presented methodology is based on conventional neural network-based survival models that are trained using data that is continuously gathered and stored at specific times, called snapshots. An important property of this type of training data is that it can contain more than one snapshot from a specific individual which results in that standard maximum likelihood training can not be directly applied since the data is not independent. However, the papers show that if the data is in a specific format where all snapshot times are the same for all individuals, called homogeneously sampled, maximum likelihood training can be applied and produce desirable results. In many cases, the data is not homogeneously sampled and in this case, it is proposed to resample the data to make it homogeneously sampled. How densely the dataset is sampled turns out to be an important parameter; it should be chosen large enough to produce good results, but this also increases the size of the dataset which makes training slow. To reduce the number of samples needed during training, the paper also proposes a technique to, instead of resampling the dataset once before the training starts, randomly resample the dataset at the start of each epoch during the training. The proposed methodology is evaluated on both a simulated dataset and an experimental dataset of starter battery failures. The results show that if the data is homogeneously sampled the methodology works as intended and produces accurate survival models. The results also show that randomly resampling the dataset on each epoch is an effective way to reduce the size of the training data.
- Europe > Sweden > Östergötland County > Linköping (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
SCANIA Component X Dataset: A Real-World Multivariate Time Series Dataset for Predictive Maintenance
Kharazian, Zahra, Lindgren, Tony, Magnússon, Sindri, Steinert, Olof, Reyna, Oskar Andersson
This paper presents a description of a real-world, multivariate time series dataset collected from an anonymized engine component (called Component X) of a fleet of trucks from SCANIA, Sweden. This dataset includes diverse variables capturing detailed operational data, repair records, and specifications of trucks while maintaining confidentiality by anonymization. It is well-suited for a range of machine learning applications, such as classification, regression, survival analysis, and anomaly detection, particularly when applied to predictive maintenance scenarios. The large population size and variety of features in the format of histograms and numerical counters, along with the inclusion of temporal information, make this real-world dataset unique in the field. The objective of releasing this dataset is to give a broad range of researchers the possibility of working with real-world data from an internationally well-known company and introduce a standard benchmark to the predictive maintenance field, fostering reproducible research.
- Europe > Sweden > Stockholm > Stockholm (0.05)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe > Italy > Veneto > Venice (0.04)
- Asia (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
- Information Technology > Data Science > Data Mining > Anomaly Detection (0.55)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Maintenance Techniques for Anomaly Detection AIOps Solutions
Poenaru-Olaru, Lorena, Karpova, Natalia, Cruz, Luis, Rellermeyer, Jan, van Deursen, Arie
Anomaly detection techniques are essential in automating the monitoring of IT systems and operations. These techniques imply that machine learning algorithms are trained on operational data corresponding to a specific period of time and that they are continuously evaluated on newly emerging data. Operational data is constantly changing over time, which affects the performance of deployed anomaly detection models. Therefore, continuous model maintenance is required to preserve the performance of anomaly detectors over time. In this work, we analyze two different anomaly detection model maintenance techniques in terms of the model update frequency, namely blind model retraining and informed model retraining. We further investigate the effects of updating the model by retraining it on all the available data (full-history approach) and on only the newest data (sliding window approach). Moreover, we investigate whether a data change monitoring tool is capable of determining when the anomaly detection model needs to be updated through retraining.
- North America > United States > California > San Francisco County > San Francisco (0.28)
- North America > United States > District of Columbia > Washington (0.05)
- Europe > Netherlands > South Holland > Delft (0.04)
- (6 more...)
Safety Performance of Neural Networks in the Presence of Covariate Shift
Cheng, Chih-Hong, Ruess, Harald, Theodorou, Konstantinos
Covariate shift may impact the operational safety performance of neural networks. A re-evaluation of the safety performance, however, requires collecting new operational data and creating corresponding ground truth labels, which often is not possible during operation. We are therefore proposing to reshape the initial test set, as used for the safety performance evaluation prior to deployment, based on an approximation of the operational data. This approximation is obtained by observing and learning the distribution of activation patterns of neurons in the network during operation. The reshaped test set reflects the distribution of neuron activation values as observed during operation, and may therefore be used for re-evaluating safety performance in the presence of covariate shift. First, we derive conservative bounds on the values of neurons by applying finite binning and static dataflow analysis. Second, we formulate a mixed integer linear programming (MILP) constraint for constructing the minimum set of data points to be removed in the test set, such that the difference between the discretized test and operational distributions is bounded. We discuss potential benefits and limitations of this constraint-based approach based on our initial experience with an implemented research prototype.
Building a vision for real-time artificial intelligence
I recently had a conversation with a senior executive who had just landed at a new organization. He had been trying to gather new data insights but was frustrated at how long it was taking. After walking his executive team through the data hops, flows, integrations, and processing across different ingestion software, databases, and analytical platforms, they were shocked by the complexity of their current data architecture and technology stack. It was obvious that things had to change for the organization to be able to execute at speed in real time. Data is a key component when it comes to making accurate and timely recommendations and decisions in real time, particularly when organizations try to implement real-time artificial intelligence.
Iterative Assessment and Improvement of DNN Operational Accuracy
Guerriero, Antonio, Pietrantuono, Roberto, Russo, Stefano
Deep Neural Networks (DNN) are nowadays largely adopted in many application domains thanks to their human-like, or even superhuman, performance in specific tasks. However, due to unpredictable/unconsidered operating conditions, unexpected failures show up on field, making the performance of a DNN in operation very different from the one estimated prior to release. In the life cycle of DNN systems, the assessment of accuracy is typically addressed in two ways: offline, via sampling of operational inputs, or online, via pseudo-oracles. The former is considered more expensive due to the need for manual labeling of the sampled inputs. The latter is automatic but less accurate. We believe that emerging iterative industrial-strength life cycle models for Machine Learning systems, like MLOps, offer the possibility to leverage inputs observed in operation not only to provide faithful estimates of a DNN accuracy, but also to improve it through remodeling/retraining actions. We propose DAIC (DNN Assessment and Improvement Cycle), an approach which combines ''low-cost'' online pseudo-oracles and ''high-cost'' offline sampling techniques to estimate and improve the operational accuracy of a DNN in the iterations of its life cycle. Preliminary results show the benefits of combining the two approaches and integrating them in the DNN life cycle.
- North America > United States > California > San Francisco County > San Francisco (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > Italy (0.04)
- Asia > India (0.04)